Skip to content

docs(research): substrate-discovery via Zeta-native-AOT scoping 2026-05-03#1385

Merged
AceHack merged 2 commits intomainfrom
research/substrate-discovery-zeta-native-aot-scoping-2026-05-03
May 3, 2026
Merged

docs(research): substrate-discovery via Zeta-native-AOT scoping 2026-05-03#1385
AceHack merged 2 commits intomainfrom
research/substrate-discovery-zeta-native-aot-scoping-2026-05-03

Conversation

@AceHack
Copy link
Copy Markdown
Member

@AceHack AceHack commented May 3, 2026

Summary

Architect-within-authority decision per CLAUDE.md "don't ask permission within authority scope": yes, Zeta-native-AOT is the best long-term solution for the custom substrate index the maintainer 2026-05-03 named.

Why Zeta-native-AOT (vs alternatives)

Three load-bearing reasons:

  1. The workload IS Z-set algebra by definition. file-add / file-remove / file-modify is exactly the delta-stream IndexedZSet.fs + Incremental.fs + Operators.fs consume. Strongest match.
  2. NativeAOT validates a deployment story we already need. Cron / CI / agent-loop ticks all want fast startup. Substrate-discovery is the cheapest first try.
  3. Pre-v1 dogfooding is HIGHER leverage than deferring. Per the 2026-05-03 math-proofs assessment, core algebra is A-grade verified; the API surface needs real-world exercise.

Alternatives rejected: TS+sqlite-vec/DuckDB (faster but doesn't dogfood); live-off-the-land (punts architecture); hybrid (two systems).

Doc covers

  • Substrate types: memory + skill + agent + rule + command + BACKLOG + tick shards + research + doc cross-refs + code symbols
  • 6 query workloads with target latencies (cold-start, warm)
  • Operator mapping per query (IndexedZSet keys, joins, filters, aggregates)
  • NativeAOT deployment shape: CLI + watcher daemon + library
  • 4-phase migration plan with parallel-run retirement gates
  • 7-row risk register

Next concrete work

Phase 0 PoC (2-3 ticks): F# project skeleton + NativeAOT publish + smoke test invocation. Validates toolchain end-to-end before substantial commit.

Test plan

  • §33 archive-header lint passes
  • Memory-references lint passes (0 broken refs)
  • Composes with math-proofs assessment + IndexedZSet.fs + DbspChainRule.lean + DbspSpec.tla

🤖 Generated with Claude Code

AceHack added 2 commits May 3, 2026 07:31
…1382) + math-proofs assessment opened (#1383)

#1382 merged removing the 5 deferred .sh files (audit-memory-index-
duplicates, audit-memory-references, check-archive-header-section33,
check-no-conflict-markers, check-tick-history-order). 7 active
substrate surfaces updated (LOST-FILES-LOCATIONS.md + RESUME.md +
4 sibling-tool comments + baseline file). Closes the cleanup loop
the maintainer opened in #1371.

PR #1383 opens the honest math-proofs assessment per the maintainer
2026-05-03 ask: synthesis doc grading every formal-verification
artifact A/B/C against peer-review readiness. P0 outstanding work
identified: Lean lake-build CI job + Stryker CI + registry rows for
in-CI TLA+ specs + peer-review email draft.

Discipline lesson: honest-assessment-as-peer-review-prerequisite —
external reviewers need a grade map, not a re-verification sweep.
…05-03

Architect-within-authority decision (per CLAUDE.md "don't ask
permission within authority scope — only two real gates"): yes,
Zeta-native-AOT IS the best long-term solution for the custom
substrate index the maintainer 2026-05-03 named.

Three load-bearing reasons override the classical-PM defer-the-
dogfooding default:

1. The workload IS Z-set algebra by definition. file-add /
   file-remove / file-modify is exactly the delta-stream the IVM
   primitives consume (IndexedZSet.fs + Incremental.fs +
   Operators.fs).
2. NativeAOT validates a deployment story we already need (cron /
   CI / agent-loop ticks all want fast startup). Substrate-discovery
   is the cheapest first try.
3. Pre-v1 dogfooding is HIGHER leverage than deferring. Per the
   2026-05-03 math-proofs assessment, the core algebra is A-grade
   verified; the API surface needs real-world exercise.

Alternatives considered and rejected: TS+sqlite-vec/DuckDB (faster
but doesn't dogfood); live-off-the-land via Skill router + grep
(punts architecture); hybrid TS+Zeta (two systems, more
complexity).

Doc covers: substrate types we index (memory + skill + agent +
rule + command + BACKLOG + tick shards + research + doc cross-refs
+ code symbols), 6 query workloads with target latencies, operator
mapping per query, NativeAOT deployment shape (CLI + watcher
daemon + library), 4-phase migration plan with parallel-run
retirement gates, 7-row risk register, composes-with cross-links.

Phase 0 next concrete work: PoC validating toolchain end-to-end
(2-3 ticks; F# project + NativeAOT publish + smoke test).

Header carries §33-style Scope/Attribution/Operational status/
Non-fusion disclaimer; check-archive-header-section33.ts passes.
Operational status: research-grade.
Copilot AI review requested due to automatic review settings May 3, 2026 11:37
@AceHack AceHack enabled auto-merge (squash) May 3, 2026 11:37
@AceHack AceHack merged commit f250489 into main May 3, 2026
23 checks passed
@AceHack AceHack deleted the research/substrate-discovery-zeta-native-aot-scoping-2026-05-03 branch May 3, 2026 11:39
AceHack added a commit that referenced this pull request May 3, 2026
…ative-AOT scoping decision (#1385) (#1386)

Architect-within-authority decision corrected by maintainer 2026-05-03
ask-permission framing violation per CLAUDE.md "don't ask permission
within authority scope". Made the call: yes, Zeta-native-AOT IS the
best long-term solution.

PR #1385 lands docs/research/2026-05-03-substrate-discovery-zeta-
native-aot-scoping.md covering substrate types + 6 query workloads +
operator mapping + NativeAOT deployment shape + 4-phase migration plan
+ Phase 0 PoC as next concrete work + 7-row risk register.

Discipline lesson: architect-within-authority-scope decision-discipline
— when CLAUDE.md says you have authority, MAKE THE CALL; don't reframe
as "your call to prioritize". Save permission-asks for the two real
gates (budget-increase + permanent-WONT-DO).
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a research scoping document describing a Zeta + NativeAOT direction for “substrate-discovery” indexing, and records the work as a per-tick shard entry for 2026-05-03.

Changes:

  • Add scoping write-up for substrate indexing types, query workloads, operator mapping, and a phased migration plan targeting a Zeta NativeAOT tool.
  • Add a tick-history shard documenting the related autonomous-loop work for 2026-05-03 11:30Z.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
docs/research/2026-05-03-substrate-discovery-zeta-native-aot-scoping.md New research scoping doc for a Zeta NativeAOT-based substrate-discovery/indexing tool and migration plan.
docs/hygiene-history/ticks/2026/05/03/1130Z.md New tick shard capturing the session/tick audit trail entry.

Comment on lines +226 to +227
- `docs/research/2026-05-03-math-proofs-honest-assessment.md`
(the algebra is A-grade verified; this dogfoods it)
Comment on lines +183 to +185
`memory/**.md` only. Re-implement
`audit-memory-references.ts` + `audit-memory-index-duplicates.ts`
as Zeta queries. Run BOTH the .ts and the F# binary in CI;
Comment on lines +239 to +240
`audit-memory-index-duplicates.ts` (Phase-1 dogfood
targets — re-implement as Zeta queries)
Comment on lines +228 to +229
- `src/Core/IndexedZSet.fs` + `Incremental.fs` + `Operators.fs`
+ `ZSet.fs` (the primitives)
AceHack added a commit that referenced this pull request May 3, 2026
…ons + chat-as-assertion-channel discipline (#1388)

* docs(research)+memory: chat-is-assertion-channel discipline + substrate-discovery scoping epistemic-corrections

Three substantive corrections from maintainer 2026-05-03 chat-channel
exchange post-#1385 merge, all related to epistemic discipline:

1. **Chat is assertion-channel, not fact-channel** (new memory file +
   MEMORY.md pointer): *"when i speak i'm making assertions, that's
   the best way to describe this chat channel"*. Chat-claims need
   evidence to elevate to architectural fact. Push-back-with-evidence
   is the discipline; echo-as-fact is the failure mode.

2. **Live-off-the-land for harness-loaded surfaces is a HYPOTHESIS,
   not a fact** (#1385 scoping doc revision): the maintainer said
   "maybe", architect echoed as architectural fact. Re-graded as
   hypothesis with three falsifiable tests:
   - .claude/rules/ auto-load canary
   - skill-persona behavioral observation
   - external-PR-reviewer behavioral observation
   Phase 0 PoC scope expanded: include ONE of these tests as
   prerequisite evidence.

3. **Distribution = dual-mode (NativeAOT + self-contained JIT)**
   (#1385 scoping doc revision): maintainer 2026-05-03 *"the whole
   Zeta-native-AOT direction self contained jit is the rethink"* +
   *"we want to support both anyways, they are both useful in
   different sistuaitons"*. Both intentional support targets, not
   AOT-with-JIT-fallback. Trade-off table added: AOT for fast-startup
   contexts; JIT for reflection-heavy library-mode contexts. Phase 0
   PoC validates BOTH modes cross-platform.

Doc additionally re-graded each layer (Zeta-native-AOT canonical /
DuckDB oracle / live-off-the-land / distribution feasibility) as
fact / decision / assertion / hypothesis with evidence labels.

Composes with Otto-364 search-first-authority + razor-discipline
(no metaphysical inferences) + substrate-or-it-didn't-happen +
verify-then-claim.

§33 archive-header lint passes. Memory-index integrity passes (788
refs resolve, 0 broken).

* docs(research): substrate-discovery — match existing AOT-core-plus-JIT-plugins architecture per Zeta.Bayesian prior art

Maintainer 2026-05-03 caught the dual-mode framing reinventing
existing architecture: *"we already have a AOT core that can load
JIT plugins see the Baseyan."*

Verified prior art in repo:

- src/Bayesian/Bayesian.fsproj line 9: explicit comment "Explicitly
  NOT AOT-enforced — this is a plugin. Core stays AOT-clean."
- Project description: "Opt-in: this project doesn't enforce
  PublishAot=true because it may optionally use Infer.NET, which
  depends on reflection-emit."
- src/Core/Core.fsproj contains PluginApi.fs (IOperator<'TOut>
  plugin-author contract) + PluginHarness.fs (test harness for
  plugin operator authors)

So the architecture is:

- **Zeta.Core** = AOT-clean library with the plugin contract
- **Plugin projects** = separate fsproj, NOT AOT-enforced, can use
  reflection-heavy libraries (Infer.NET for Bayesian; future
  DuckDB.NET for the verification-oracle path; etc.)

Substrate-discovery follows this pattern:

- Core indexing/query engine ships AOT-published as a small binary
  (zero-install for external-agent use case)
- Reflection-heavy extensions (DuckDB cross-check oracle, ML
  similarity scoring) ship as separate JIT plugins loaded by the
  AOT core on demand
- The IOperator<'TOut> contract is stable across the AOT/JIT
  boundary

Phase 0 PoC scope updated:

- Build minimal Zeta.SubstrateDiscovery AOT-clean library; publish
  AOT on linux-x64, osx-arm64, win-x64
- Optionally: sibling Zeta.SubstrateDiscovery.DuckDB JIT plugin
- If AOT has compatibility issues, the rethink is narrow (extract
  the affected dependency to a JIT plugin) not wholesale
  re-architecture — because the pattern is already shipping in
  Zeta.Bayesian

§33 lint passes.

* docs(research): substrate-discovery — DST as load-bearing, not afterthought

Maintainer 2026-05-03 reminder: *"i'm sure you remember all the DST
goodness right?"* — surfaces that DST integration was buried as a
single line in the original doc instead of being treated as
load-bearing.

Adds new "DST integration" section under Distribution feasibility:

- Cold-start replay = warm-state IVM is the central correctness
  invariant (CI-enforced, not just property-tested)
- File-watcher events are adversarial schedules — DST replays them
  deterministically with pinned seed, making concurrent-modification
  / partial-write / atomic-rename quirks reproducible test cases
- Every non-determinism source must be exposed (dictionary order,
  hashtable insertion, async scheduler, plugin-load timing) and
  pinned — per Otto-281 retries are non-determinism smell
- Chain-rule Prop 3.2 Lean proof guarantees algebraic determinism;
  DST proves the implementation matches; both required for A-grade

Concrete DST primitives in Phase 0 PoC:

- Pinned random seeds (Otto-273; 69/420 whimsy)
- Replay mode (event sequence + seed → identical Z-set state)
- CI job comparing cold-start replay vs warm-state IVM at every
  commit
- Adversarial-schedule fuzz harness for pathological file-watcher
  event sequences

Composes with Otto-272 DST-everywhere + Otto-273 seed-lock-policy +
Otto-281 DST-exempt-is-deferred-bug + the chain-rule Lean proof +
the math-proofs assessment A-grade definition.

§33 lint passes.

* fix(memory/chat-assertion-channel): address review thread — yes=good consistency in discipline-check

Reviewer caught: original Discipline check questions had Q1 + Q2
phrased so 'no' was the desired answer (didn't echo, didn't encode
'maybe' as 'is') but the conclusion said 'any no = failure mode' —
internal inconsistency.

Reworded all 4 questions so 'yes' is uniformly the desired answer:
- Did I grade every chat-assertion's evidence base?
- Did I keep 'maybe' framed as 'maybe'?
- Did I document falsifiability tests?
- Did I attribute assertions to whoever asserted them?

Conclusion is now consistent: 'no' = failure mode, triggers revision.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants